Fix: Avoid `_like` function in Chunking #340
Conversation
Doesn't seem to work yet
@AlexanderSinn thx, fixed the constructor now. Can you please try again? Can you please also post a reproducer with the exact HiPACE compilation flags & inputs set that shows the problem?
It now works again, but it's still very slow compared to h5py (actually 10 times slower than development: now 20 minutes instead of 1-2 minutes for the ~200 MB h5 file, while h5py takes 3 seconds). To reproduce, go to
And change
to
Then compile hipace for cpu (
This is the file https://syncandshare.desy.de/index.php/s/dX77jPNAofgwMwJ
```python
# mask invalid regions with NaN
data = np.full_like(record_component, np.nan)
for chunk in chunks:
    chunk_slice = chunk_to_slice(chunk)
    print(f"chunk: {chunk}, {chunk_slice}")
    # read only valid region
    x = record_component[chunk_slice]
```
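For context, `chunk_to_slice` converts a chunk's offset/extent description into per-dimension slices. A minimal sketch, assuming a chunk object with `offset` and `extent` attributes (as in openPMD-api's chunk info; the `Chunk` class here is a hypothetical stand-in):

```python
class Chunk:
    """Hypothetical stand-in for an openPMD chunk descriptor."""
    def __init__(self, offset, extent):
        self.offset = offset
        self.extent = extent

def chunk_to_slice(chunk):
    # one slice per dimension, covering [offset, offset + extent)
    return tuple(slice(o, o + e) for o, e in zip(chunk.offset, chunk.extent))
```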
I wonder if this is related to openPMD/openPMD-api#1225.
Prior to #332, we did not copy an extra time.
Actually, @AlexanderSinn found that `x = record_component[()]` is fast, but it becomes slow the moment we have the `data = np.full_like(record_component, np.nan)` above.
We could try to make the above line a `np.full` with explicit shape and dtype.
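A minimal sketch of that suggestion (a plain ndarray stands in for the openPMD record component here; with the real component, `np.full_like` would first convert it to an array, i.e. an implicit full read, while `np.full` only needs shape and dtype):

```python
import numpy as np

# stand-in for the record component; only .shape and .dtype are used below
record_component = np.zeros((4, 5), dtype=np.float64)

# proposed replacement: allocate the NaN buffer without ever touching
# record_component's data
data = np.full(record_component.shape, np.nan, dtype=record_component.dtype)
```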
Now it works and is just as fast as h5py!
Thank you so much for your help, Alex. Much appreciated 💖
@AlexanderSinn @ax3l Thank you so much for figuring this out, and for fixing it! ✨
When we prepare chunked reads, we assume a single chunk for all backends but ADIOS2.
Preparing the returned data, we use `data = np.full_like(record_component, np.nan)`. It turns out that numpy seems to trigger a `__getitem__` access or a full copy of our `record_component` at this point, which causes a severe slowdown.
Refs.:
This was first seen for particles, but affects every read where we do not slice a subset.
Thanks a lot to @AlexanderSinn for debugging this with me ✨
This fixes the performance regression that @AlexanderSinn, @MaxThevenet and @SeverinDiederichs saw in Hi-PACE/hipace#725.
Regression to #332, #334
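The fixed read path can be sketched as follows. This is a simplified illustration, not the PR's exact code: a plain ndarray stands in for the record component, and `chunks` is assumed to be a list of `(offset, extent)` pairs describing the valid regions:

```python
import numpy as np

# stand-in for a lazily-loaded record component and its valid chunks
record_component = np.arange(20.0).reshape(4, 5)
chunks = [([0, 0], [2, 5]), ([2, 0], [2, 5])]

# allocate from shape/dtype only -- no np.full_like, so numpy never
# reads record_component's data during allocation
data = np.full(record_component.shape, np.nan, dtype=record_component.dtype)

for offset, extent in chunks:
    chunk_slice = tuple(slice(o, o + e) for o, e in zip(offset, extent))
    # read only the valid region into the NaN-initialized buffer
    data[chunk_slice] = record_component[chunk_slice]
```

Regions not covered by any chunk stay NaN, which is exactly the "mask invalid regions" behavior the original loop intended.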